Skip to content

Fix hdrHeap/hdrStrHeap allocator inuse metric underflow#13218

Merged
moonchen merged 2 commits into
apache:masterfrom
moonchen:worktree-hdrheap-inuse-underflow
Jun 16, 2026
Merged

Fix hdrHeap/hdrStrHeap allocator inuse metric underflow#13218
moonchen merged 2 commits into
apache:masterfrom
moonchen:worktree-hdrheap-inuse-underflow

Conversation

@moonchen

@moonchen moonchen commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Problem

With ENABLE_ALLOCATOR_METRICS=ON, proxy.process.allocator.inuse.hdrHeap
and proxy.process.allocator.inuse.hdrStrHeap go deeply negative —
observed wrapping around as a huge uint64 value (displayed ~ -5.8e10).
Every other allocator's inuse stays sane.

Root cause

hdrHeapAllocator and strHeapAllocator are plain Allocator globals,
and their objects are allocated with no constructor arguments
(THREAD_ALLOC(hdrHeapAllocator, this_ethread()), then a manual
init()/placement-new). With no extra args, THREAD_ALLOC resolves to
the non-templated overload
thread_alloc(Allocator &, ProxyAllocator &) in ProxyAllocator.cc
rather than the templated thread_alloc(CAlloc &, ProxyAllocator &, Args &&...)
in ProxyAllocator.h.

The two overloads diverged:

  • The templated overload (used by every ClassAllocator<> allocator —
    IO buffers, sessions, etc.) calls a.increment_for_alloc() when it
    reuses an object from the per-thread freelist.
  • The non-templated overload did not.

Meanwhile THREAD_FREE always decrements inuse (UPDATE_FREE_METRICS)
when an object is pushed onto the freelist. So for these two allocators,
every reuse-from-freelist decremented inuse with no matching increment,
and the counter marched negative until it wrapped. Only hdrHeap and
hdrStrHeap hit the non-templated overload, which is why only they
underflowed.

Fix

Add the missing increment_for_alloc() to the freelist-reuse path of the
non-templated thread_alloc, mirroring the templated overload, so the two
paths stay symmetric. Guarded by TS_USE_ALLOCATOR_METRICS.

Test

Added a regression test in test_HdrHeap.cc (guarded by
TS_USE_ALLOCATOR_METRICS) that enables the per-thread freelist and cycles
a single hdrHeap/hdrStrHeap block through free→reuse 1000 times,
asserting inuse returns to baseline+1 and never dips below the starting
value.

Before the fix the test fails as inuse reaches -999/-1000
(reproducing the underflow); after the fix it passes. The full
test_proxy_hdrs suite is green.

🤖 Generated with Claude Code

The hdrHeap and hdrStrHeap allocators are plain Allocator globals whose
objects are allocated with no constructor arguments, so THREAD_ALLOC
resolves to the non-templated thread_alloc(Allocator &, ProxyAllocator &)
overload. Unlike the templated overload, it did not call
increment_for_alloc() when reusing an object from the per-thread
freelist, while THREAD_FREE always decremented inuse on the way in.

The counted frees therefore outran the counted allocs and
proxy.process.allocator.inuse.{hdrHeap,hdrStrHeap} marched negative,
wrapping around as a huge uint64 value. Other allocators use
ClassAllocator (the templated path) and were unaffected.

Account for the freelist reuse in the non-templated overload so the two
paths stay symmetric, and add a regression test that cycles a block
through the freelist and asserts inuse stays balanced and never dips
below its starting value.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 1, 2026 19:20
@moonchen moonchen self-assigned this Jun 1, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes allocator inuse metric underflow for hdrHeap / hdrStrHeap when TS_USE_ALLOCATOR_METRICS is enabled by making the non-templated thread_alloc(Allocator &, ProxyAllocator &) update metrics consistently with the templated overload, and adds a regression unit test to prevent recurrence.

Changes:

  • Update the non-templated thread_alloc(Allocator &, ProxyAllocator &) freelist-reuse path to call increment_for_alloc() under TS_USE_ALLOCATOR_METRICS.
  • Add a Catch2 regression test that repeatedly frees/reuses a single block via the per-thread freelist and asserts inuse remains balanced for hdrHeap and hdrStrHeap.
  • Add metrics-related includes and helpers in the HdrHeap unit test (guarded by TS_USE_ALLOCATOR_METRICS).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/iocore/eventsystem/ProxyAllocator.cc Fixes allocator metrics accounting symmetry on freelist reuse for the non-templated thread_alloc.
src/proxy/hdrs/unit_tests/test_HdrHeap.cc Adds a regression test to validate inuse remains stable across freelist free→reuse cycles for hdr heap allocators.

Comment thread src/proxy/hdrs/unit_tests/test_HdrHeap.cc
@moonchen moonchen added this to the 10.2.0 milestone Jun 1, 2026
A failing REQUIRE()/CHECK() throws and would skip the manual restore of
the cmd_disable_pfreelist harness flag, contaminating later tests in the
same process. Use an RAII guard that preserves and restores the original
int value on scope exit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bryancall bryancall self-requested a review June 1, 2026 21:46
bryancall
bryancall previously approved these changes Jun 9, 2026

@bryancall bryancall left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(duplicate — disregard; see my other approval)

@bryancall bryancall left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve. Correctly fixes the inuse underflow: the non-templated thread_alloc freelist-reuse path was missing the increment_for_alloc() that THREAD_FREE always pairs with, so frees outran allocs and inuse.{hdrHeap,hdrStrHeap} wrapped. Restores symmetry with the templated overload; regression test (RAII-guarded cmd_disable_pfreelist) is solid. CI green.

@bryancall bryancall dismissed their stale review June 9, 2026 17:57

Duplicate review double-posted by tooling — dismissing; the identical approval below stands.

@moonchen moonchen merged commit 113f5fb into apache:master Jun 16, 2026
15 checks passed
@github-project-automation github-project-automation Bot moved this to For v10.2.0 in ATS v10.2.x Jun 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: For v10.2.0

Development

Successfully merging this pull request may close these issues.

4 participants